{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Distill SFT"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "# 国内需要禁止统计，否则会卡在模型加载的地方（连不到外网）\n",
    "os.environ[\"UNSLOTH_DISABLE_STATISTICS\"] = \"0\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n",
      "Unsloth: Failed to patch Gemma3ForConditionalGeneration.\n",
      "🦥 Unsloth Zoo will now patch everything to make training faster!\n",
      "INFO 04-22 14:01:30 [__init__.py:239] Automatically detected platform cuda.\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Sliding Window Attention is enabled but not implemented for `eager`; unexpected results may be encountered.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "==((====))==  Unsloth 2025.3.19: Fast Qwen2 patching. Transformers: 4.51.3. vLLM: 0.8.4.\n",
      "   \\\\   /|    NVIDIA GeForce RTX 4090. Num GPUs = 1. Max memory: 22.159 GB. Platform: Linux.\n",
      "O^O/ \\_/ \\    Torch: 2.6.0+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.2.0\n",
      "\\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post2. FA2 = False]\n",
      " \"-____-\"     Free license: http://github.com/unslothai/unsloth\n",
      "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Unsloth 2025.3.19 patched 28 layers with 28 QKV layers, 28 O layers and 28 MLP layers.\n"
     ]
    }
   ],
   "source": [
    "from unsloth import FastLanguageModel, is_bfloat16_supported\n",
    "import torch\n",
    "\n",
    "max_seq_length = 8192 # Can increase for longer reasoning traces\n",
    "lora_rank = 64 # Larger rank = smarter, but slower\n",
    "\n",
    "model, tokenizer = FastLanguageModel.from_pretrained(\n",
    "    model_name = \"/data/countdown/output/models/Qwen2.5-1.5B-Instruct\", # change to your model path\n",
    "    max_seq_length = max_seq_length,\n",
    "    load_in_4bit = False, # False for LoRA 16bit\n",
    "    local_files_only=True,\n",
    "    #fast_inference = True, # Enable vLLM fast inference\n",
    "    max_lora_rank = lora_rank,\n",
    "    #gpu_memory_utilization = 0.5, # Reduce if out of memory\n",
    ")\n",
    "\n",
    "model = FastLanguageModel.get_peft_model(\n",
    "    model,\n",
    "    r = lora_rank, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128\n",
    "    target_modules = [\n",
    "        \"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n",
    "        \"gate_proj\", \"up_proj\", \"down_proj\",\n",
    "    ], # Remove QKVO if out of memory\n",
    "    lora_alpha = lora_rank * 2,\n",
    "    use_gradient_checkpointing = \"unsloth\", # Enable long context finetuning\n",
    "    random_state = 3407,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 可以使用训练中定义的，也可以使用模型自带的 chat template\n",
    "from unsloth.chat_templates import get_chat_template\n",
    "tokenizer = get_chat_template(\n",
    "   tokenizer,\n",
    "   chat_template = \"qwen-2.5\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "15a5d51805314b9f834d80b4462a3de8",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Generating train split: 0 examples [00:00, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "dataset = load_dataset(\"json\", data_files=\"data/train_sft_distill.jsonl\")[\"train\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "340831ea56034a279c64a0b66ead69d1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Map:   0%|          | 0/712 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def apply_chat_template(examples):\n",
    "    texts = tokenizer.apply_chat_template(examples[\"messages\"])\n",
    "    texts = [tokenizer.decode(text) for text in texts]\n",
    "    return { \"text\" : texts }\n",
    "\n",
    "dataset = dataset.map(apply_chat_template, batched = True, load_from_cache_file=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<|im_start|>system\n",
      "You are a helpful assistant. You first thinks about the reasoning process in the mind and then provides the user with the answer.<|im_end|>\n",
      "<|im_start|>user\n",
      "Using the numbers 30, 24, 10, 35, create an equation that equals 43. You can use basic arithmetic operations (+, -, *, /) one or multiple times but each number can only be used once, and you must use all the numbers. Show your work in <think> </think> tags. And return the final equation in <answer> </answer> tags, for example <answer>(1 + 2) / 3</answer>. Think step by step inside <think> tags.<|im_end|>\n",
      "<|im_start|>assistant\n",
      "<think>Okay, let's see. I need to make an equation using all the numbers 30, 24, 10, and 35 exactly once, combining them with basic arithmetic operations (+, -, *, /) to get 43. Hmm, let's start by thinking about possible combinations.\n",
      "\n",
      "First, 43 is a prime number, so maybe addition or subtraction could be part of the solution. Let's look at the numbers: 30, 24, 10, 35. The largest number is 35, so maybe if I subtract something from 35 to get closer to 43. Wait, 35 is smaller than 43, so that's not helpful. Alternatively, adding 35 and 10 gives 45, which is close to 43. Then maybe subtract 2? But I don't have a 2. Hmm. Alternatively, 35 + (something) =43. So 43-35=8. So I need to make 8 with 30, 24, and 10. But that's three numbers, and I can only use each once. Let's see: 30, 24, 10. How can I combine these to get 8? 30 - 24 =6, then 6 + 10=16. No. 24 - (30/10) =24-3=21. Not 8. 30/10=3, 24-3=21. Still not. Maybe 24/(30/10)=24/3=8. Oh, that's possible! Let me check. 30 divided by 10 is 3, then 24 divided by 3 is 8. So then 35 + 8 =43. So putting it all together: 35 + (24 / (30 /10)) =35 + (24/3)=35+8=43. Let me verify each number is used once: 35, 24, 30, 10. Yes, all are used once. Operations are division inside the parentheses and then addition. That should work. Let me write it out: 35 + 24/(30/10) =43. Let me check the calculation again. 30 divided by 10 is 3, then 24 divided by 3 is 8. 35 plus 8 is 43. Exactly. So that's the equation. Wait, is there another way? Maybe, but this seems to work. Let me make sure there's no error here. All numbers used once, operations allowed. Yep, looks good.\n",
      "</think>\n",
      "\n",
      "<answer>35 + 24 / (30 / 10) = 43</answer><|im_end|>\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(dataset[99][\"text\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[34m\u001b[1mwandb\u001b[0m: \u001b[32m\u001b[41mERROR\u001b[0m Failed to detect the name of this notebook. You can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.\n",
      "\u001b[34m\u001b[1mwandb\u001b[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.\n",
      "\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33mswulling\u001b[0m to \u001b[32mhttps://api.wandb.ai\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "Tracking run with wandb version 0.19.9"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "Run data is saved locally in <code>/data/countdown/wandb/run-20250422_140204-fgqin0fq</code>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "Syncing run <strong><a href='https://wandb.ai/swulling/countdown-sft-distill/runs/fgqin0fq' target=\"_blank\">wandering-sun-6</a></strong> to <a href='https://wandb.ai/swulling/countdown-sft-distill' target=\"_blank\">Weights & Biases</a> (<a href='https://wandb.me/developer-guide' target=\"_blank\">docs</a>)<br>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       " View project at <a href='https://wandb.ai/swulling/countdown-sft-distill' target=\"_blank\">https://wandb.ai/swulling/countdown-sft-distill</a>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       " View run at <a href='https://wandb.ai/swulling/countdown-sft-distill/runs/fgqin0fq' target=\"_blank\">https://wandb.ai/swulling/countdown-sft-distill/runs/fgqin0fq</a>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<button onClick=\"this.nextSibling.style.display='block';this.style.display='none';\">Display W&B run</button><iframe src='https://wandb.ai/swulling/countdown-sft-distill/runs/fgqin0fq?jupyter=true' style='border:none;width:100%;height:420px;display:none;'></iframe>"
      ],
      "text/plain": [
       "<wandb.sdk.wandb_run.Run at 0x7f502260ab60>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import wandb\n",
    "wandb.init(project=\"countdown-sft-distill\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "96509deb78af443f89f5b5d9fbb80322",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Unsloth: Tokenizing [\"text\"] (num_proc=12):   0%|          | 0/712 [00:00<?, ? examples/s]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from trl import SFTTrainer, SFTConfig\n",
    "trainer = SFTTrainer(\n",
    "    model = model,\n",
    "    tokenizer = tokenizer,\n",
    "    train_dataset = dataset,\n",
    "    eval_dataset = None, # Can set up evaluation!\n",
    "    args = SFTConfig(\n",
    "        dataset_text_field = \"text\",\n",
    "        max_seq_length = max_seq_length, # 最大长度\n",
    "        per_device_train_batch_size = 8,\n",
    "        gradient_accumulation_steps = 1, # Use GA to mimic batch size!\n",
    "        warmup_ratio = 0.1,\n",
    "        num_train_epochs = 3,\n",
    "        learning_rate = 1e-4, # Reduce to 2e-5 for long training runs\n",
    "        logging_steps = 1,\n",
    "        optim = \"adamw_8bit\",\n",
    "        weight_decay = 0.01,\n",
    "        lr_scheduler_type = \"cosine\",\n",
    "        seed = 3407,\n",
    "        report_to = \"wandb\", # Use this for WandB etc\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<|im_start|>system\n",
      "You are a helpful assistant. You first thinks about the reasoning process in the mind and then provides the user with the answer.<|im_end|>\n",
      "<|im_start|>user\n",
      "Using the numbers 30, 24, 10, 35, create an equation that equals 43. You can use basic arithmetic operations (+, -, *, /) one or multiple times but each number can only be used once, and you must use all the numbers. Show your work in <think> </think> tags. And return the final equation in <answer> </answer> tags, for example <answer>(1 + 2) / 3</answer>. Think step by step inside <think> tags.<|im_end|>\n",
      "<|im_start|>assistant\n",
      "<think>Okay, let's see. I need to make an equation using all the numbers 30, 24, 10, and 35 exactly once, combining them with basic arithmetic operations (+, -, *, /) to get 43. Hmm, let's start by thinking about possible combinations.\n",
      "\n",
      "First, 43 is a prime number, so maybe addition or subtraction could be part of the solution. Let's look at the numbers: 30, 24, 10, 35. The largest number is 35, so maybe if I subtract something from 35 to get closer to 43. Wait, 35 is smaller than 43, so that's not helpful. Alternatively, adding 35 and 10 gives 45, which is close to 43. Then maybe subtract 2? But I don't have a 2. Hmm. Alternatively, 35 + (something) =43. So 43-35=8. So I need to make 8 with 30, 24, and 10. But that's three numbers, and I can only use each once. Let's see: 30, 24, 10. How can I combine these to get 8? 30 - 24 =6, then 6 + 10=16. No. 24 - (30/10) =24-3=21. Not 8. 30/10=3, 24-3=21. Still not. Maybe 24/(30/10)=24/3=8. Oh, that's possible! Let me check. 30 divided by 10 is 3, then 24 divided by 3 is 8. So then 35 + 8 =43. So putting it all together: 35 + (24 / (30 /10)) =35 + (24/3)=35+8=43. Let me verify each number is used once: 35, 24, 30, 10. Yes, all are used once. Operations are division inside the parentheses and then addition. That should work. Let me write it out: 35 + 24/(30/10) =43. Let me check the calculation again. 30 divided by 10 is 3, then 24 divided by 3 is 8. 35 plus 8 is 43. Exactly. So that's the equation. Wait, is there another way? Maybe, but this seems to work. Let me make sure there's no error here. All numbers used once, operations allowed. Yep, looks good.\n",
      "</think>\n",
      "\n",
      "<answer>35 + 24 / (30 / 10) = 43</answer><|im_end|>\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(tokenizer.decode(trainer.train_dataset[99][\"input_ids\"]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# todo: train completions only"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1\n",
      "   \\\\   /|    Num examples = 712 | Num Epochs = 3 | Total steps = 267\n",
      "O^O/ \\_/ \\    Batch size per device = 8 | Gradient accumulation steps = 1\n",
      "\\        /    Data Parallel GPUs = 1 | Total batch size (8 x 1 x 1) = 8\n",
      " \"-____-\"     Trainable parameters = 73,859,072/1,617,573,376 (4.57% trained)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "    <div>\n",
       "      \n",
       "      <progress value='267' max='267' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
       "      [267/267 10:12, Epoch 3/3]\n",
       "    </div>\n",
       "    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       " <tr style=\"text-align: left;\">\n",
       "      <th>Step</th>\n",
       "      <th>Training Loss</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td>1</td>\n",
       "      <td>1.271100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>2</td>\n",
       "      <td>1.236200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>3</td>\n",
       "      <td>1.289900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>4</td>\n",
       "      <td>1.269100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>5</td>\n",
       "      <td>1.316000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>6</td>\n",
       "      <td>1.204100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>7</td>\n",
       "      <td>1.190500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>8</td>\n",
       "      <td>1.163100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>9</td>\n",
       "      <td>1.132000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>10</td>\n",
       "      <td>1.044300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>11</td>\n",
       "      <td>1.023000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>12</td>\n",
       "      <td>1.014500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>13</td>\n",
       "      <td>0.908000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>14</td>\n",
       "      <td>0.986800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>15</td>\n",
       "      <td>0.941400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>16</td>\n",
       "      <td>0.868400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>17</td>\n",
       "      <td>0.828100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>18</td>\n",
       "      <td>0.862400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>19</td>\n",
       "      <td>0.815700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>20</td>\n",
       "      <td>0.744100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>21</td>\n",
       "      <td>0.734600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>22</td>\n",
       "      <td>0.811700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>23</td>\n",
       "      <td>0.741400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>24</td>\n",
       "      <td>0.767400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>25</td>\n",
       "      <td>0.726600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>26</td>\n",
       "      <td>0.680000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>27</td>\n",
       "      <td>0.654900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>28</td>\n",
       "      <td>0.692900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>29</td>\n",
       "      <td>0.627000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>30</td>\n",
       "      <td>0.685400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>31</td>\n",
       "      <td>0.685900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>32</td>\n",
       "      <td>0.629900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>33</td>\n",
       "      <td>0.667800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>34</td>\n",
       "      <td>0.628400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>35</td>\n",
       "      <td>0.633500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>36</td>\n",
       "      <td>0.546000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>37</td>\n",
       "      <td>0.596200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>38</td>\n",
       "      <td>0.611100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>39</td>\n",
       "      <td>0.632100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>40</td>\n",
       "      <td>0.618400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>41</td>\n",
       "      <td>0.656000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>42</td>\n",
       "      <td>0.621900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>43</td>\n",
       "      <td>0.586300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>44</td>\n",
       "      <td>0.654700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>45</td>\n",
       "      <td>0.569500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>46</td>\n",
       "      <td>0.607000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>47</td>\n",
       "      <td>0.625800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>48</td>\n",
       "      <td>0.536300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>49</td>\n",
       "      <td>0.650800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>50</td>\n",
       "      <td>0.595000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>51</td>\n",
       "      <td>0.589200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>52</td>\n",
       "      <td>0.601800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>53</td>\n",
       "      <td>0.561700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>54</td>\n",
       "      <td>0.622300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>55</td>\n",
       "      <td>0.514200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>56</td>\n",
       "      <td>0.585500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>57</td>\n",
       "      <td>0.547400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>58</td>\n",
       "      <td>0.622800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>59</td>\n",
       "      <td>0.602300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>60</td>\n",
       "      <td>0.580300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>61</td>\n",
       "      <td>0.576300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>62</td>\n",
       "      <td>0.606300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>63</td>\n",
       "      <td>0.614600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>64</td>\n",
       "      <td>0.572700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>65</td>\n",
       "      <td>0.626500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>66</td>\n",
       "      <td>0.556800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>67</td>\n",
       "      <td>0.587200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>68</td>\n",
       "      <td>0.580500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>69</td>\n",
       "      <td>0.544700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>70</td>\n",
       "      <td>0.557300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>71</td>\n",
       "      <td>0.556700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>72</td>\n",
       "      <td>0.538900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>73</td>\n",
       "      <td>0.604900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>74</td>\n",
       "      <td>0.553800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>75</td>\n",
       "      <td>0.515900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>76</td>\n",
       "      <td>0.544200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>77</td>\n",
       "      <td>0.551000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>78</td>\n",
       "      <td>0.542100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>79</td>\n",
       "      <td>0.529600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>80</td>\n",
       "      <td>0.552100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>81</td>\n",
       "      <td>0.556000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>82</td>\n",
       "      <td>0.584800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>83</td>\n",
       "      <td>0.537700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>84</td>\n",
       "      <td>0.524300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>85</td>\n",
       "      <td>0.501300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>86</td>\n",
       "      <td>0.543600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>87</td>\n",
       "      <td>0.537300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>88</td>\n",
       "      <td>0.533000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>89</td>\n",
       "      <td>0.570800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>90</td>\n",
       "      <td>0.527300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>91</td>\n",
       "      <td>0.520000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>92</td>\n",
       "      <td>0.469700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>93</td>\n",
       "      <td>0.477500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>94</td>\n",
       "      <td>0.523900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>95</td>\n",
       "      <td>0.527000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>96</td>\n",
       "      <td>0.506500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>97</td>\n",
       "      <td>0.470300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>98</td>\n",
       "      <td>0.524700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>99</td>\n",
       "      <td>0.508300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>100</td>\n",
       "      <td>0.543600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>101</td>\n",
       "      <td>0.506700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>102</td>\n",
       "      <td>0.499300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>103</td>\n",
       "      <td>0.511200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>104</td>\n",
       "      <td>0.457100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>105</td>\n",
       "      <td>0.509600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>106</td>\n",
       "      <td>0.507300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>107</td>\n",
       "      <td>0.511100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>108</td>\n",
       "      <td>0.515900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>109</td>\n",
       "      <td>0.495600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>110</td>\n",
       "      <td>0.481900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>111</td>\n",
       "      <td>0.549200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>112</td>\n",
       "      <td>0.515500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>113</td>\n",
       "      <td>0.492600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>114</td>\n",
       "      <td>0.528900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>115</td>\n",
       "      <td>0.467500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>116</td>\n",
       "      <td>0.510800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>117</td>\n",
       "      <td>0.497900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>118</td>\n",
       "      <td>0.521700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>119</td>\n",
       "      <td>0.518400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>120</td>\n",
       "      <td>0.517400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>121</td>\n",
       "      <td>0.494400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>122</td>\n",
       "      <td>0.493500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>123</td>\n",
       "      <td>0.486300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>124</td>\n",
       "      <td>0.484100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>125</td>\n",
       "      <td>0.466400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>126</td>\n",
       "      <td>0.507000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>127</td>\n",
       "      <td>0.497100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>128</td>\n",
       "      <td>0.519400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>129</td>\n",
       "      <td>0.552900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>130</td>\n",
       "      <td>0.520400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>131</td>\n",
       "      <td>0.558100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>132</td>\n",
       "      <td>0.504200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>133</td>\n",
       "      <td>0.476300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>134</td>\n",
       "      <td>0.522700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>135</td>\n",
       "      <td>0.475800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>136</td>\n",
       "      <td>0.501300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>137</td>\n",
       "      <td>0.504500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>138</td>\n",
       "      <td>0.513500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>139</td>\n",
       "      <td>0.499300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>140</td>\n",
       "      <td>0.478200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>141</td>\n",
       "      <td>0.490700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>142</td>\n",
       "      <td>0.506900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>143</td>\n",
       "      <td>0.480000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>144</td>\n",
       "      <td>0.478600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>145</td>\n",
       "      <td>0.481700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>146</td>\n",
       "      <td>0.453800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>147</td>\n",
       "      <td>0.509000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>148</td>\n",
       "      <td>0.387900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>149</td>\n",
       "      <td>0.470200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>150</td>\n",
       "      <td>0.512400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>151</td>\n",
       "      <td>0.442700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>152</td>\n",
       "      <td>0.546600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>153</td>\n",
       "      <td>0.485900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>154</td>\n",
       "      <td>0.484600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>155</td>\n",
       "      <td>0.471100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>156</td>\n",
       "      <td>0.460600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>157</td>\n",
       "      <td>0.479100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>158</td>\n",
       "      <td>0.495300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>159</td>\n",
       "      <td>0.496900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>160</td>\n",
       "      <td>0.510200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>161</td>\n",
       "      <td>0.524200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>162</td>\n",
       "      <td>0.522800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>163</td>\n",
       "      <td>0.448600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>164</td>\n",
       "      <td>0.523300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>165</td>\n",
       "      <td>0.472300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>166</td>\n",
       "      <td>0.474700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>167</td>\n",
       "      <td>0.484300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>168</td>\n",
       "      <td>0.502400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>169</td>\n",
       "      <td>0.479900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>170</td>\n",
       "      <td>0.458200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>171</td>\n",
       "      <td>0.499500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>172</td>\n",
       "      <td>0.467400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>173</td>\n",
       "      <td>0.443200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>174</td>\n",
       "      <td>0.515200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>175</td>\n",
       "      <td>0.477800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>176</td>\n",
       "      <td>0.458200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>177</td>\n",
       "      <td>0.482800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>178</td>\n",
       "      <td>0.446100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>179</td>\n",
       "      <td>0.431500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>180</td>\n",
       "      <td>0.445500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>181</td>\n",
       "      <td>0.407400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>182</td>\n",
       "      <td>0.458300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>183</td>\n",
       "      <td>0.461800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>184</td>\n",
       "      <td>0.454700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>185</td>\n",
       "      <td>0.424700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>186</td>\n",
       "      <td>0.450000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>187</td>\n",
       "      <td>0.464600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>188</td>\n",
       "      <td>0.461500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>189</td>\n",
       "      <td>0.485400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>190</td>\n",
       "      <td>0.493500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>191</td>\n",
       "      <td>0.429300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>192</td>\n",
       "      <td>0.444000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>193</td>\n",
       "      <td>0.467000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>194</td>\n",
       "      <td>0.451500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>195</td>\n",
       "      <td>0.426000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>196</td>\n",
       "      <td>0.398200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>197</td>\n",
       "      <td>0.414300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>198</td>\n",
       "      <td>0.473600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>199</td>\n",
       "      <td>0.505800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>200</td>\n",
       "      <td>0.454200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>201</td>\n",
       "      <td>0.437100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>202</td>\n",
       "      <td>0.413700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>203</td>\n",
       "      <td>0.419500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>204</td>\n",
       "      <td>0.432400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>205</td>\n",
       "      <td>0.510300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>206</td>\n",
       "      <td>0.491900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>207</td>\n",
       "      <td>0.396900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>208</td>\n",
       "      <td>0.433900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>209</td>\n",
       "      <td>0.447900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>210</td>\n",
       "      <td>0.455300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>211</td>\n",
       "      <td>0.449200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>212</td>\n",
       "      <td>0.463700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>213</td>\n",
       "      <td>0.471100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>214</td>\n",
       "      <td>0.453800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>215</td>\n",
       "      <td>0.431200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>216</td>\n",
       "      <td>0.419200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>217</td>\n",
       "      <td>0.428400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>218</td>\n",
       "      <td>0.433600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>219</td>\n",
       "      <td>0.438900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>220</td>\n",
       "      <td>0.457000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>221</td>\n",
       "      <td>0.471700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>222</td>\n",
       "      <td>0.428500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>223</td>\n",
       "      <td>0.456900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>224</td>\n",
       "      <td>0.449000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>225</td>\n",
       "      <td>0.441100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>226</td>\n",
       "      <td>0.476600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>227</td>\n",
       "      <td>0.493100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>228</td>\n",
       "      <td>0.437600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>229</td>\n",
       "      <td>0.488900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>230</td>\n",
       "      <td>0.434800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>231</td>\n",
       "      <td>0.461800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>232</td>\n",
       "      <td>0.454200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>233</td>\n",
       "      <td>0.404400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>234</td>\n",
       "      <td>0.452400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>235</td>\n",
       "      <td>0.430200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>236</td>\n",
       "      <td>0.430900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>237</td>\n",
       "      <td>0.465600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>238</td>\n",
       "      <td>0.449000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>239</td>\n",
       "      <td>0.406600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>240</td>\n",
       "      <td>0.466300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>241</td>\n",
       "      <td>0.425900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>242</td>\n",
       "      <td>0.461900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>243</td>\n",
       "      <td>0.441300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>244</td>\n",
       "      <td>0.447100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>245</td>\n",
       "      <td>0.454800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>246</td>\n",
       "      <td>0.457600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>247</td>\n",
       "      <td>0.426700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>248</td>\n",
       "      <td>0.422700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>249</td>\n",
       "      <td>0.408000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>250</td>\n",
       "      <td>0.446700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>251</td>\n",
       "      <td>0.444300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>252</td>\n",
       "      <td>0.464600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>253</td>\n",
       "      <td>0.457400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>254</td>\n",
       "      <td>0.481000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>255</td>\n",
       "      <td>0.439500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>256</td>\n",
       "      <td>0.455000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>257</td>\n",
       "      <td>0.462600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>258</td>\n",
       "      <td>0.462900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>259</td>\n",
       "      <td>0.446100</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>260</td>\n",
       "      <td>0.432900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>261</td>\n",
       "      <td>0.448700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>262</td>\n",
       "      <td>0.462000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>263</td>\n",
       "      <td>0.457200</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>264</td>\n",
       "      <td>0.422300</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>265</td>\n",
       "      <td>0.433600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>266</td>\n",
       "      <td>0.445900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td>267</td>\n",
       "      <td>0.417000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table><p>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Unsloth: Will smartly offload gradients to save VRAM!\n"
     ]
    }
   ],
   "source": [
    "trainer_stats = trainer.train()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "wandb link: https://wandb.ai/swulling/countdown-sft-distill/runs/aaenr5bf?nw=nwuserswulling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('output/qwen2.5-1.5b-sft-distill-lora/tokenizer_config.json',\n",
       " 'output/qwen2.5-1.5b-sft-distill-lora/special_tokens_map.json',\n",
       " 'output/qwen2.5-1.5b-sft-distill-lora/vocab.json',\n",
       " 'output/qwen2.5-1.5b-sft-distill-lora/merges.txt',\n",
       " 'output/qwen2.5-1.5b-sft-distill-lora/added_tokens.json',\n",
       " 'output/qwen2.5-1.5b-sft-distill-lora/tokenizer.json')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.save_pretrained(\"output/qwen2.5-1.5b-sft-distill-lora\")  # Local saving lora weights\n",
    "tokenizer.save_pretrained(\"output/qwen2.5-1.5b-sft-distill-lora\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "vllm inference\n",
    "```bash\n",
    "vllm serve Qwen/Qwen2.5-1.5B-Instruct --port 8100 --api-key NLUKKXIJDZ91rpg1z --enforce-eager  --max-model-len 10000 --enable-lora --max-lora-rank 64 --lora-modules qwen2.5-1.5b-sft-distill-lora=output/qwen2.5-1.5b-sft-distill-lora\n",
    "\n",
    "CURATOR_VIEWER=1 python eval.py --provider vllm --data_path data/test.jsonl --model_name qwen2.5-1.5b-sft-distill-lora --temperature 0.01 --max_tokens 8192\n",
    "\n",
    "https://curator.bespokelabs.ai/datasets/38a2fef9c1614b0089c12ea38d2248e6    \n",
    "\n",
    "Accuracy: 44/100 (44.00%)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "分析：\n",
    "\n",
    "1. 格式上 SFT 后完全遵守，没有问题\n",
    "2. 相比于简单的 SFT，因为增加大量思考过程，还是会出现数字使用错误（包括没有使用或者使用超过一次）\n",
    "3. 正确率有显著提升，证明模型确实学习到了推理过程\n",
    "4. 大量错误都是因为无法找到正确答案超过最长 tokens 限制。\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
